128 research outputs found

Testing the suitability of polynomial models in errors-in-variables problems

Author: Hall Peter
Ma Yanyuan
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2007

A low-degree polynomial model for a response curve is used commonly in practice. It generally incorporates a linear or quadratic function of the covariate. In this paper we suggest methods for testing the goodness of fit of a general polynomial model when there are errors in the covariates. There, the true covariates are not directly observed, and conventional bootstrap methods for testing are not applicable. We develop a new approach, in which deconvolution methods are used to estimate the distribution of the covariates under the null hypothesis, and a "wild" or moment-matching bootstrap argument is employed to estimate the distribution of the experimental errors (distinct from the distribution of the errors in covariates). Most of our attention is directed at the case where the distribution of the errors in covariates is known, although we also discuss methods for estimation and testing when the covariate error distribution is estimated. No assumptions are made about the distribution of experimental error, and, in particular, we depart substantially from conventional parametric models for errors-in-variables problems. Comment: Published at http://dx.doi.org/10.1214/009053607000000361 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
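As a rough illustration of the "wild" bootstrap idea mentioned in this abstract, the sketch below resamples ordinary regression residuals using Mammen's two-point weights, which have mean 0, variance 1, and third moment 1. It is a generic wild bootstrap for a quadratic fit with error-free covariates, not the paper's errors-in-variables procedure; the names `SUPPORT`, `PROBS`, and `wild_bootstrap_sample`, and the simulated data, are assumptions made for the example.

```python
import numpy as np

# Mammen's two-point distribution: mean 0, variance 1, third moment 1.
SQRT5 = np.sqrt(5.0)
SUPPORT = np.array([(1.0 - SQRT5) / 2.0, (1.0 + SQRT5) / 2.0])
PROBS = np.array([(SQRT5 + 1.0) / (2.0 * SQRT5), (SQRT5 - 1.0) / (2.0 * SQRT5)])


def wild_bootstrap_sample(fitted, residuals, rng):
    """Return one bootstrap response vector: fitted + residual * random weight."""
    weights = rng.choice(SUPPORT, size=len(residuals), p=PROBS)
    return fitted + residuals * weights


# Hypothetical data: quadratic trend plus noise (covariates observed without error).
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 200)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(scale=0.3, size=x.size)

# Fit the null (quadratic) model and form residuals.
coefs = np.polyfit(x, y, deg=2)
fitted = np.polyval(coefs, x)
residuals = y - fitted

# One wild bootstrap replicate of the responses under the fitted null model.
y_star = wild_bootstrap_sample(fitted, residuals, rng)
```

Each replicate keeps the covariates fixed and perturbs only the residuals, so no parametric form for the error distribution is assumed.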

arXiv.org e-Print Archive

Crossref

Texas A&M Repository

Variable selection in measurement error models

Author: Li Runze
Ma Yanyuan
Publication venue: 'Bernoulli Society for Mathematical Statistics and Probability'
Publication date: 23/02/2010

Measurement error data or errors-in-variables data have been collected in many studies. Natural criterion functions are often unavailable for general functional measurement error models due to the lack of information on the distribution of the unobservable covariates. Typically, parameter estimation proceeds by solving estimating equations. In addition, the construction of such estimating equations routinely requires solving integral equations, hence the computation is often much more intensive than for ordinary regression models. Because of these difficulties, traditional best subset variable selection procedures are not applicable, and in the measurement error model context, variable selection remains an unsolved issue. In this paper, we develop a framework for variable selection in measurement error models via penalized estimating equations. We first propose a class of selection procedures for general parametric measurement error models and for general semi-parametric measurement error models, and study the asymptotic properties of the proposed procedures. Then, under certain regularity conditions and with a properly chosen regularization parameter, we demonstrate that the proposed procedure performs as well as an oracle procedure. We assess the finite sample performance via Monte Carlo simulation studies and illustrate the proposed methodology through the empirical analysis of a familiar data set. Comment: Published at http://dx.doi.org/10.3150/09-BEJ205 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm

arXiv.org e-Print Archive

Crossref

Optimal variance estimation without estimating the mean function

Author: Ma Yanyuan
Tong Tiejun
Wang Yuedong
Publication venue: 'Bernoulli Society for Mathematical Statistics and Probability'
Publication date: 01/01/2013

We study the least squares estimator in the residual variance estimation context. We show that the mean squared differences of paired observations are asymptotically normally distributed. We further establish that, by regressing the mean squared differences of these paired observations on the squared distances between paired covariates via a simple least squares procedure, the resulting variance estimator is not only asymptotically normal and root-n consistent, but also reaches the optimal bound in terms of estimation variance. We also demonstrate the advantage of the least squares estimator in comparison with existing methods in terms of the second order asymptotic properties. Comment: Published at http://dx.doi.org/10.3150/12-BEJ432 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
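The regression described in this abstract can be sketched as follows, assuming a model y = g(x) + e with a smooth g and constant error variance. For each lag k, half the mean squared difference of responses k positions apart is regressed on the mean squared covariate distance at that lag, and the fitted intercept serves as the variance estimate. The function name `ls_variance_estimate`, the unweighted fit, the choice of `max_lag`, and the simulated data are illustrative assumptions, not the authors' exact estimator.

```python
import numpy as np


def ls_variance_estimate(x, y, max_lag):
    """Estimate the residual variance as the intercept of a least squares fit
    of half mean squared response differences on squared covariate distances."""
    order = np.argsort(x)
    x = np.asarray(x, dtype=float)[order]
    y = np.asarray(y, dtype=float)[order]
    lags = np.arange(1, max_lag + 1)
    # Half mean squared differences of responses at each lag.
    s = np.array([0.5 * np.mean((y[k:] - y[:-k]) ** 2) for k in lags])
    # Mean squared distances between the paired covariates at each lag.
    d = np.array([np.mean((x[k:] - x[:-k]) ** 2) for k in lags])
    design = np.column_stack([np.ones_like(d), d])
    coef, *_ = np.linalg.lstsq(design, s, rcond=None)
    return coef[0]  # intercept: extrapolation to zero covariate distance


# Hypothetical example: sine mean function, true error variance 0.25.
rng = np.random.default_rng(0)
n = 2000
x = np.linspace(0.0, 1.0, n)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.5, size=n)
sigma2_hat = ls_variance_estimate(x, y, max_lag=10)
```

The mean function is never estimated: differencing nearby observations removes most of g, and the regression on squared distance accounts for the small remaining bias.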

arXiv.org e-Print Archive

CiteSeerX

Fused kernel-spline smoothing for repeatedly measured outcomes in a generalized partially linear model with functional single index

Author: Jiang Fei
Ma Yanyuan
Wang Yuanjia
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2015

We propose a generalized partially linear functional single index risk score model for repeatedly measured outcomes where the index itself is a function of time. We fuse the nonparametric kernel method and regression spline method, and modify the generalized estimating equation to facilitate estimation and inference. We use a local smoothing kernel to estimate the unspecified coefficient functions of time, and use B-splines to estimate the unspecified function of the single index component. The covariance structure is taken into account via a working model, which provides valid estimation and inference procedures whether or not it captures the true covariance. The estimation method is applicable to both continuous and discrete outcomes. We derive large sample properties of the estimation procedure and show a different convergence rate for each component of the model. The asymptotic properties when the kernel and regression spline methods are combined in a nested fashion have not been studied prior to this work, even in the independent data case. Comment: Published at http://dx.doi.org/10.1214/15-AOS1330 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

PubMed Central

HKU Scholars Hub

A SPLINE-ASSISTED SEMIPARAMETRIC APPROACH TO NONPARAMETRIC MEASUREMENT ERROR MODELS

Author: Jiang Fei
Ma Yanyuan
Publication venue: Collection of Biostatistics Research Archive
Publication date: 01/03/2018

Nonparametric estimation of the probability density function of a random variable measured with error is considered to be a difficult problem, in the sense that depending on the measurement error property, the estimation rate can be as slow as the logarithm of the sample size. Likewise, nonparametric estimation of the regression function with errors in the covariate suffers the same possibly slow rate. The traditional methods for both problems are based on deconvolution, where the slow convergence rate is caused by the quick convergence to zero of the Fourier transform of the measurement error density, which, unfortunately, appears in the denominators during the construction of these methods. Using a completely different approach of spline-assisted semiparametric methods, we are able to construct nonparametric estimators of both density functions and regression mean functions that achieve the same nonparametric convergence rate as in the error free case. Other than requiring the error-prone variable distribution to be compactly supported, our assumptions are not stronger than in the classical deconvolution literature. The performance of these methods is demonstrated through some simulations and a data example.

Collection Of Biostatistics Research Archive

Testing for high-dimensional white noise

Author: Feng Long
Liu Binghui
Ma Yanyuan
Publication venue
Publication date: 05/11/2022

Testing for multi-dimensional white noise is an important subject in statistical inference. Such testing in the high-dimensional case remains an open problem, especially when the dimension of a time series is comparable to or even greater than the sample size. To detect an arbitrary form of departure from high-dimensional white noise, a few tests have been developed. Some of these tests are based on max-type statistics, while others are based on sum-type ones. Despite the progress, an urgent issue remains to be resolved: none of these tests is robust to the sparsity of the serial correlation structure. Motivated by this, we propose a Fisher's combination test by combining the max-type and the sum-type statistics, based on the established asymptotic independence between them. This combination test can achieve robustness to the sparsity of the serial correlation structure, and combine the advantages of the two types of tests. We demonstrate the advantages of the proposed test over some existing tests through extensive numerical results and an empirical analysis. Comment: 84 pages
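The combination step named in this abstract can be illustrated with Fisher's classical method for two independent p-values. Under independence, -2(log p1 + log p2) follows a chi-square distribution with 4 degrees of freedom, whose survival function has the closed form exp(-t/2)(1 + t/2). The sketch takes the two p-values as given inputs; how the paper computes its max-type and sum-type statistics is not reproduced here, and the function name `fisher_combine` and the example values are assumptions.

```python
import math


def fisher_combine(p_max, p_sum):
    """Combine two independent p-values with Fisher's method.

    p_max and p_sum stand in for the p-values of a max-type and a sum-type
    test; they are placeholders supplied by the caller.
    """
    if not (0.0 < p_max <= 1.0 and 0.0 < p_sum <= 1.0):
        raise ValueError("p-values must lie in (0, 1]")
    stat = -2.0 * (math.log(p_max) + math.log(p_sum))
    # Survival function of a chi-square distribution with 4 degrees of freedom.
    return math.exp(-stat / 2.0) * (1.0 + stat / 2.0)


# Hypothetical sparse alternative: strong max-type signal, weak sum-type signal.
p_sparse = fisher_combine(1e-4, 0.6)
# Hypothetical dense alternative: weak max-type signal, strong sum-type signal.
p_dense = fisher_combine(0.4, 1e-3)
```

In both hypothetical cases the combined p-value stays small, which is the sense in which a combination can draw on whichever component test is sensitive to the alternative at hand.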

arXiv.org e-Print Archive